733 research outputs found

Algorithms and Adaptivity Gaps for Stochastic k-TSP

Authors: Jiang Haotian, Li Jian, Liu Daogao, Singla Sahil
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)
Publication date: 06/11/2019

Given a metric $(V,d)$ and a $\textsf{root} \in V$, the classic $k$-TSP problem is to find a tour originating at the $\textsf{root}$ of minimum length that visits at least $k$ nodes in $V$. In this work, motivated by applications where the input to an optimization problem is uncertain, we study two stochastic versions of $k$-TSP. In Stoch-Reward $k$-TSP, originally defined by Ene-Nagarajan-Saket [ENS17], each vertex $v$ in the given metric $(V,d)$ contains a stochastic reward $R_v$. The goal is to adaptively find a tour of minimum expected length that collects at least reward $k$; here "adaptively" means our next decision may depend on previous outcomes. Ene et al. give an $O(\log k)$-approximation adaptive algorithm for this problem, and left open if there is an $O(1)$-approximation algorithm. We totally resolve their open question and even give an $O(1)$-approximation non-adaptive algorithm for this problem. We also introduce and obtain similar results for the Stoch-Cost $k$-TSP problem. In this problem each vertex $v$ has a stochastic cost $C_v$, and the goal is to visit and select at least $k$ vertices to minimize the expected sum of tour length and cost of selected vertices. This problem generalizes the Price of Information framework [Singla18] from deterministic probing costs to metric probing costs. Our techniques are based on two crucial ideas: "repetitions" and "critical scaling". We show using Freedman's and Jogdeo-Samuels' inequalities that for our problems, if we truncate the random variables at an ideal threshold and repeat, then their expected values form a good surrogate. Unfortunately, this ideal threshold is adaptive as it depends on how far we are from achieving our target $k$, so we truncate at various different scales and identify a "critical" scale. Comment: ITCS 202

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Forward and Inverse Approximation Theory for Linear Temporal Convolutional Networks

Authors: Jiang Haotian, Li Qianxiao
Publication date: 29/05/2023

We present a theoretical analysis of the approximation properties of convolutional architectures when applied to the modeling of temporal sequences. Specifically, we prove an approximation rate estimate (Jackson-type result) and an inverse approximation theorem (Bernstein-type result), which together provide a comprehensive characterization of the types of sequential relationships that can be efficiently captured by a temporal convolutional architecture. The rate estimate improves upon a previous result via the introduction of a refined complexity measure, whereas the inverse approximation theorem is new.

arXiv.org e-Print Archive

Approximation theory of transformer networks for sequence modeling

Authors: Jiang Haotian, Li Qianxiao
Publication date: 29/05/2023

The transformer is a widely applied architecture in sequence modeling applications, but the theoretical understanding of its working principles is limited. In this work, we investigate the ability of transformers to approximate sequential relationships. We first prove a universal approximation theorem for the transformer hypothesis space. From its derivation, we identify a novel notion of regularity under which we can prove an explicit approximation rate estimate. This estimate reveals key structural properties of the transformer and suggests the types of sequence relationships that the transformer is adapted to approximating. In particular, it allows us to concretely discuss the structural bias between the transformer and classical sequence modeling methods, such as recurrent neural networks. Our findings are supported by numerical experiments.

arXiv.org e-Print Archive

Natural Graph Wavelet Packet Dictionaries

Authors: Cloninger Alexander, Li Haotian, Saito Naoki
Publication date: 12/03/2021

We introduce a set of novel multiscale basis transforms for signals on graphs that utilize their "dual" domains by incorporating the "natural" distances between graph Laplacian eigenvectors, rather than simply using the eigenvalue ordering. These basis dictionaries can be seen as generalizations of the classical Shannon wavelet packet dictionary to arbitrary graphs, and do not rely on the frequency interpretation of Laplacian eigenvalues. We describe the algorithms (involving either vector rotations or orthogonalizations) to construct these basis dictionaries, use them to efficiently approximate graph signals through the best basis search, and demonstrate the strengths of these basis dictionaries for graph signals measured on sunflower graphs and street networks.

arXiv.org e-Print Archive

eScholarship - University of California